We present an architecture of QCPU (Quantum Central Processing Unit) which is

1 Introduction



Quantum information processing offers great advantages both for quantum 
communication and quantum computation [1,2]. The latter is implemented 
by unitary operations on a set of two-level systems known as qubits. These 
unitary operations are usually decomposed as quantum gate arrays. Depending 
on what unitary operation is desired, different gate arrays are used [3]. By 
contrast, a classical computer can be implemented as a fixed classical gate
array, the CPU (Central Processing Unit), into which a program and data are
input. The program specifies the operations to be performed on the data, so a
CPU can be programmed to perform any possible function on the input data.
Implementing different operations with different software (programs) rather
than different hardware (circuits of gates) is preferable because we can verify
whether the software has been prepared correctly before using it, and we can
discard or repair it at little cost if it is found to be faulty. In contrast, if
hardware fails badly during the execution of a computation, for example if some
gates in the circuit are found to be set wrongly or to have failed, it might be
necessary to build entirely new hardware.

The possibility of building an analogous QCPU (Quantum Central Processing Unit)
was first studied by Nielsen and Chuang [4]. The problem was originally
formulated in terms of a programmable array of quantum gates, which can be
described as a fixed unitary operator $G$ that acts on both the program and
the data. The initial state $|P_U\rangle$ of the program register stores
information about the unitary operation $U$ that is going to be performed on a
data register initially prepared in a state $|D\rangle$. The total dynamics of
the programmable quantum gate array is given by

$$G(|P_U\rangle \otimes |D\rangle) = |P'_U\rangle \otimes U|D\rangle. \qquad (1)$$

Nielsen and Chuang demonstrated that no deterministic universal quantum gate
array exists that can be programmed to perform any unitary operation. Recently
Vidal et al. [5] and Kim et al. [6] each proposed schemes to store arbitrary
one-qubit unitary operations in quantum states and to retrieve them with
probability $p = 1 - 1/2^m$, where $m$ is the number of qubits used to encode
the unitary operation on one qubit. Hillery et al. presented a probabilistic
quantum processor acting on a single qudit of dimension $N$ [7]. These schemes
all store unitary operations and retrieve them from the program register
exactly, but only probabilistically. Thus, if $N$ gates are used in the whole
computation, the overall success probability is $p^N$, which decreases
exponentially with $N$. It is also not clear how to implement universal
quantum computation with fixed hardware using these schemes.
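The exponential decay of the overall success probability can be made concrete with a short calculation. The following is a minimal sketch, assuming a per-gate success probability of the form $p = 1 - 1/2^m$ and statistically independent gates; the function names `retrieval_probability` and `overall_success` are illustrative and do not come from the cited schemes.

```python
def retrieval_probability(m: int) -> float:
    """Per-gate success probability, assumed to be p = 1 - 1/2**m."""
    return 1.0 - 0.5 ** m


def overall_success(m: int, n_gates: int) -> float:
    """Probability that n_gates independent probabilistic gates all succeed."""
    return retrieval_probability(m) ** n_gates


# Illustrative values only: m = 3 program qubits per gate.
for n_gates in (1, 10, 100, 1000):
    print(n_gates, overall_success(3, n_gates))
```

For a fixed $m$ the printed probability shrinks geometrically with the number of gates, which is the motivation for looking at a deterministic scheme.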

Any model of a classical or quantum computer must ultimately be realized by a
physical system. However, real numbers require infinite information (and
therefore infinite energy) for their representation. Since there appear to be
bounds on the energy of any physical system (the universe included), we can
only approximate real numbers in computers [8]. So any model of a computer
that is realized by a physical system only needs to represent quantities of
finite accuracy and to perform computation with finite accuracy; in other
words, a discrete model of a computer is sufficient.

Nielsen and Chuang's results do not exclude the possibility of building a gate
array which can be programmed to perform a subset of unitary operations
[4]. In this paper, we present an architecture of QCPU, a circuit of gate
arrays that can implement different discrete quantum operations on the
$n$-qubit data register, in a deterministic fashion, according to the IS
(Instruction Sequence) input into the program register. Using these discrete
quantum gates one can approximate any unitary operation on the data register.
It is shown that for any given computation accuracy $\varepsilon$ the QCPU can
be built efficiently.



2 Architecture of Deterministic and Discrete Quantum Central
Processing Unit



The circuit of the gate array for a QCPU with $n$ qubits in the data register
and $1 + m + 2(n - 1)$ qubits in the program register is illustrated in Fig. 1.
The functions of the element gates in the QCPU are explained in Fig. 2, where
$I_4$ and $I_8$ are the four- and eight-dimensional identity matrices,
respectively, and $R_x(k)$ and $R_z(k)$ denote the rotations about the x-axis
and the z-axis that are controlled by the program qubit $|a_k\rangle$.

Fig. 1. Architecture of QCPU. The operation implemented by the gate array in the
dotted-line box corresponds to the $G$ operation in Eq. (1). $|P_U\rangle$ is
the program register, $|D\rangle$ is the data register.

Fig. 2. Functions of element gates in QCPU.

The controlling qubit $|r\rangle$ indicates the axis about which the rotation
is implemented. When $|r\rangle$ is in the state $|0\rangle$ the rotation is
about the x-axis, and when $|r\rangle$ is in the state $|1\rangle$ the rotation
is about the z-axis. The qubit $|a_k\rangle$ indicates whether the operation is
implemented on the qubit $|d_1\rangle$ or not. When $|a_k\rangle$ is in the
state $|0\rangle$ the operation is not implemented, and when $|a_k\rangle$ is in
the state $|1\rangle$ the operation is implemented. The qubit $|p_k\rangle$
indicates which qubit is the control qubit for the CNOT gate on
$|d_{k-1}\rangle|d_k\rangle$. When $|p_k\rangle$ is in the state $|0\rangle$,
qubit $|d_{k-1}\rangle$ is the control qubit, and when $|p_k\rangle$ is in the
state $|1\rangle$, qubit $|d_k\rangle$ is the control qubit. The qubit
$|b_k\rangle$ indicates whether the CNOT gate is implemented or not. When
$|b_k\rangle$ is in the state $|0\rangle$ the operation is not implemented, and
when $|b_k\rangle$ is in the state $|1\rangle$ the operation is implemented.

We call the list

$$|b_n\rangle|p_n\rangle \cdots |b_3\rangle|p_3\rangle|b_2\rangle|p_2\rangle\,|a_m\rangle \cdots |a_2\rangle|a_1\rangle\,|r\rangle$$

an Instruction, and a sequence of Instructions an IS (Instruction Sequence).
The part $|b_n\rangle|p_n\rangle \cdots |b_2\rangle|p_2\rangle$ indicates where
and which CNOT gate is to be implemented on the data register. The part
$|a_m\rangle \cdots |a_2\rangle|a_1\rangle$ indicates the rotation angle, and
$|r\rangle$ indicates the axis of the rotation. By inputting the corresponding
Instruction into the program register, CNOTs on the data register and rotations
about the x-axis or z-axis on $|d_1\rangle$ can be implemented by the QCPU in a
deterministic way. Thus any unitary operation on the data register can be
approximated by the QCPU with a suitably designed IS.
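To illustrate how the fields of an Instruction can be read off when the program register is in a computational basis state, the following is a minimal sketch in Python. It assumes the bit order $b_n p_n \ldots b_2 p_2\, a_m \ldots a_1\, r$, following the order in which the fields are described above; the names `Instruction`, `parse_instruction` and `describe` are hypothetical helpers introduced only for this illustration, and the returned list does not imply any ordering of the gates inside the circuit.

```python
from typing import Dict, List, NamedTuple, Tuple


class Instruction(NamedTuple):
    cnots: Dict[int, Tuple[int, int]]  # k -> (b_k, p_k) for k = 2..n
    angle_bits: List[int]              # [a_1, ..., a_m]
    r: int                             # 0: x-axis, 1: z-axis


def parse_instruction(bits: str, n: int, m: int) -> Instruction:
    """Split a bit-string into the fields b_k, p_k, a_k and r."""
    expected = 2 * (n - 1) + m + 1
    if len(bits) != expected or set(bits) - {"0", "1"}:
        raise ValueError(f"expected {expected} characters, each 0 or 1")
    values = [int(c) for c in bits]
    # Leading 2(n-1) bits: pairs (b_k, p_k), listed from k = n down to k = 2.
    cnots = {}
    for idx, k in enumerate(range(n, 1, -1)):
        cnots[k] = (values[2 * idx], values[2 * idx + 1])
    # Next m bits: a_m ... a_1, reversed here so that index 0 holds a_1.
    start = 2 * (n - 1)
    angle_bits = values[start:start + m][::-1]
    return Instruction(cnots, angle_bits, values[-1])


def describe(bits: str, n: int, m: int) -> List[str]:
    """List the operations an Instruction switches on (no ordering implied)."""
    ins = parse_instruction(bits, n, m)
    ops = []
    for k in sorted(ins.cnots):
        b, p = ins.cnots[k]
        if b == 1:
            # p_k = 0: d_{k-1} is the control; p_k = 1: d_k is the control.
            control, target = (k, k - 1) if p == 1 else (k - 1, k)
            ops.append(f"CNOT(control=d{control}, target=d{target})")
    if any(ins.angle_bits):
        axis = "z" if ins.r == 1 else "x"
        ops.append(f"R_{axis} on d1 with (a_1..a_m) = {ins.angle_bits}")
    return ops


# Hypothetical usage with n = 2 data qubits and m = 3 angle qubits.
print(describe("110000", n=2, m=3))
print(describe("001011", n=2, m=3))
```

The sketch treats the Instruction classically; a program register in superposition would of course not be captured by a single bit-string.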



3 How does the QCPU work 

Let us explain how the QCPU works. Quantum states are governed by quantum
laws, such as the no-cloning principle and collapse under measurement. It is
impossible to reset the information of a qubit in an unknown state [9,10]. So
basic operations such as inputting Instructions into the program register are
not straightforward. We describe two working modes of the QCPU, which are
illustrated in Fig. 3 and Fig. 4.

Fig. 3. Working mode one, expanding computation in the time sequence. The
decoder drives the program register to the special state according to the IS.

Fig. 4. Working mode two, expanding computation in the space sequence.

Table 1
Steps of serial processing of QCPU in time sequence.

The first working mode expands the computation in the time sequence. It needs
only one QCPU, and no measurement is needed during the computation. In this
working mode we need to initialize the program register to some known state,
such as $|00\ldots0\rangle$, and to design the IS (programs) before
implementing the computation. The steps of serial processing of the QCPU are
listed in Table 1. We use the notation $|S\rangle$ to stand for the program
register or the data register if $S$ is a capital letter, and to indicate a
state of that register if $S$ is a capital letter with a numerical subscript.

To make it clearer how operations are encoded in an IS, we give some examples.
Specifically, suppose that the data register has 2 qubits and $m = 3$ (so the
angular step is $2\pi/2^3 = \pi/4$); the program register then has 6 qubits,
$|b_2\rangle|p_2\rangle|a_3\rangle|a_2\rangle|a_1\rangle|r\rangle$.

operations              Instruction or IS

$R_x(3\pi/2)$:          001100                       (4)
$R_z(5\pi/4)$:          001011                       (5)
Swap$(d_1, d_2)$:       100000, 110000, 100000.      (6)

The angle $\theta$ of rotation is calculated as

$$\theta = (a_1 2^0 + a_2 2^1 + \ldots + a_m 2^{m-1})\,\frac{2\pi}{2^m}, \qquad (7)$$

where $a_i = 0, 1$ $(i = 1, 2, \ldots, m)$.
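The angle formula of Eq. (7) is easy to evaluate directly. The sketch below is only an illustration: `rotation_angle` evaluates Eq. (7) for given bits $a_1, \ldots, a_m$, and `nearest_angle_bits` is a hypothetical helper that picks the representable angle closest to a target angle (rounding to the nearest grid point is an assumption made here, not something prescribed by the text).

```python
import math
from typing import List


def rotation_angle(angle_bits: List[int]) -> float:
    """Eq. (7): angle_bits = [a_1, ..., a_m], weight of a_i is 2**(i-1)."""
    m = len(angle_bits)
    integer = sum(a << i for i, a in enumerate(angle_bits))
    return integer * 2 * math.pi / 2 ** m


def nearest_angle_bits(theta: float, m: int) -> List[int]:
    """Bits [a_1, ..., a_m] of the representable angle closest to theta."""
    steps = 2 ** m
    j = round((theta % (2 * math.pi)) / (2 * math.pi) * steps) % steps
    return [(j >> i) & 1 for i in range(m)]


# Angle bits of the Instructions 001100 and 001011 for m = 3,
# i.e. (a_3 a_2 a_1) = 110 and 101, printed in units of pi.
print(rotation_angle([0, 1, 1]) / math.pi)
print(rotation_angle([1, 0, 1]) / math.pi)
```

With $m = 3$ the two example Instructions (4) and (5) correspond to the integers 6 and 5 in units of $2\pi/8$.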

The second working mode expands the computation in the space sequence, so as
many QCPUs are needed as there are Instructions in the IS. The advantage of
this working mode is that it can accept, by teleportation, an Instruction that
is unknown to the user.

In both working modes non-unitary operations can be implemented by performing
a measurement right after operating the QCPU. The QCPU can implement superposed
unitary operations by having the program register in a superposition state,
and entangled unitary operations by having the program register in an
entangled state. Suppose $|P\rangle$ is in the state
$|P_0\rangle + |P_1\rangle$; then

$$
\begin{aligned}
G((|P_0\rangle + |P_1\rangle) \otimes |D\rangle)
  &= G(|P_0\rangle \otimes |D\rangle + |P_1\rangle \otimes |D\rangle) \\
  &= G(|P_0\rangle \otimes |D\rangle) + G(|P_1\rangle \otimes |D\rangle) \\
  &= |P_0\rangle \otimes U_{P_0}|D\rangle + |P_1\rangle \otimes U_{P_1}|D\rangle. \qquad (8)
\end{aligned}
$$

The simplest example is when the operation $G$ is the CNOT gate. When the
control qubit is in a superposition such as
$\frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$, the unitary operations implemented
by the CNOT gate are superposed: the NOT operation and the identity operation
are performed in a superposed way on the controlled qubit. This makes the QCPU
very different from the classical CPU.

If we do not plan to utilize the benefits brought by the quantumness of the
program register, then we may build a hybrid QCPU which has the data register
in qubits and the program register in classical bits. This hybrid QCPU is also
capable of implementing any quantum computation on an $n$-qubit data register,
but only needs $n$ qubits and $1 + m + 2(n - 1)$ classical bits.



4 The QCPU Can Be Built Efficiently To Approximate Arbitrary
Unitary Operations To Any Given Precision

Now let us explain why an arbitrary unitary operation on the data register can
be approximated by the QCPU. It was explained above that the QCPU can implement
CNOTs on the data register, and can approximate rotations on $|d_1\rangle$
about the x-axis and z-axis with an accuracy set by the angular step
$2\pi/2^m$ of Eq. (7). Then, based on the fact that any single-qubit operation
can be expressed as at most three rotations about two non-parallel axes, the
QCPU is capable of approximating an arbitrary single-qubit operation on
$|d_1\rangle$. With the help of the swap gate, which can be accomplished by
three CNOTs, the QCPU can approximate any single-qubit operation on any qubit
of the data register. It is known that CNOTs and arbitrary single-qubit
operations together suffice for universal quantum computation [3]. Therefore
the QCPU proposed above can approximate an arbitrary unitary operation on the
data register.
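The three-CNOT swap used in this argument can be checked on computational basis states. The sketch below is a minimal classical simulation, assuming a two-qubit data register and the IS 100000, 110000, 100000 of Eq. (6), in which only the leading bits $(b_2, p_2)$ matter; the function names are illustrative. Because CNOT gates permute basis states and act linearly, exchanging every basis state implies the swap on arbitrary superpositions.

```python
from itertools import product
from typing import List, Tuple


def cnot(state: Tuple[int, ...], control: int, target: int) -> Tuple[int, ...]:
    """Apply CNOT to a basis state given as a tuple of bits (d1, d2, ...)."""
    bits = list(state)
    if bits[control] == 1:
        bits[target] ^= 1
    return tuple(bits)


def run_cnot_instructions(state: Tuple[int, int],
                          instructions: List[str]) -> Tuple[int, ...]:
    """Apply 2-qubit Instructions whose leading bits (b_2, p_2) select a CNOT."""
    for ins in instructions:
        b2, p2 = int(ins[0]), int(ins[1])
        if b2 == 1:
            # p_2 = 0: d1 controls d2; p_2 = 1: d2 controls d1 (0-based indices).
            control, target = (1, 0) if p2 == 1 else (0, 1)
            state = cnot(state, control, target)
    return state


SWAP_IS = ["100000", "110000", "100000"]
for d1, d2 in product((0, 1), repeat=2):
    print((d1, d2), "->", run_cnot_instructions((d1, d2), SWAP_IS))
```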

The error caused by using approximate rather than exact rotations is estimated
below. The angles given by Eq. (7) are evenly distributed over $[0, 2\pi]$ and
are implemented by the QCPU exactly. So the error with which the QCPU
approximates an arbitrary angle is less than the spacing $2\pi/2^m$ between
adjacent angles. Every computation finally gives outputs, and the
approximation of the unitary transformations leads to errors in the resulting
output states. The errors of the outputs, which arise from the errors of
$\theta$, are calculated below.

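A rotation about the $a$-axis can be expressed as
$U_a(\theta) = e^{i\theta\sigma_a}$, where $\sigma_a$ is a Pauli matrix,
$a = x, y, z$. Then

$$\delta U_a(\theta) = i\sigma_a e^{i\theta\sigma_a}\,\delta\theta.$$

The variation of the output state $U_a(\theta)|\psi\rangle$ reads

$$\delta(U_a(\theta)|\psi\rangle) = \delta U_a(\theta)|\psi\rangle.$$

The variation of the probability $|\langle\varphi|U_a(\theta)|\psi\rangle|^2$ of
projecting $U_a(\theta)|\psi\rangle$ onto a state $|\varphi\rangle$ reads

$$\delta P = \langle\varphi|\delta U_a(\theta)|\psi\rangle\langle\psi|U_a^\dagger(\theta)|\varphi\rangle
 + \langle\varphi|U_a(\theta)|\psi\rangle\langle\psi|\delta U_a^\dagger(\theta)|\varphi\rangle.$$

So we have $|\delta P| \le 2|\delta\theta|$.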
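This bound on the change in probability can be checked numerically. The following is a minimal sketch in plain Python, assuming $U_a(\theta) = e^{i\theta\sigma_a} = \cos\theta\, I + i\sin\theta\,\sigma_a$ as defined above; the helper names (`rotation`, `random_state`, `probability`, `max_ratio`) and the sampling ranges are choices made for this illustration only. It draws random states, axes and angle errors and records the largest observed ratio of the probability change to the angle error.

```python
import math
import random

PAULI = {
    "x": ((0, 1), (1, 0)),
    "y": ((0, -1j), (1j, 0)),
    "z": ((1, 0), (0, -1)),
}


def rotation(axis, theta):
    """U_a(theta) = exp(i*theta*sigma_a) = cos(theta) I + i sin(theta) sigma_a."""
    s = PAULI[axis]
    c, sn = math.cos(theta), math.sin(theta)
    return tuple(
        tuple(c * (1 if i == j else 0) + 1j * sn * s[i][j] for j in range(2))
        for i in range(2)
    )


def random_state(rng):
    """Random normalized single-qubit state as a list of two amplitudes."""
    amps = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(2)]
    norm = math.sqrt(sum(abs(a) ** 2 for a in amps))
    return [a / norm for a in amps]


def probability(axis, theta, psi, phi):
    """P = |<phi| U_a(theta) |psi>|^2."""
    u = rotation(axis, theta)
    out = [sum(u[i][j] * psi[j] for j in range(2)) for i in range(2)]
    overlap = sum(phi[i].conjugate() * out[i] for i in range(2))
    return abs(overlap) ** 2


def max_ratio(trials=2000, seed=0):
    """Largest observed |P(theta + d) - P(theta)| / |d| over random samples."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        axis = rng.choice("xyz")
        theta = rng.uniform(0, 2 * math.pi)
        delta = rng.choice((-1, 1)) * rng.uniform(1e-3, 5e-2)
        psi, phi = random_state(rng), random_state(rng)
        dp = (probability(axis, theta + delta, psi, phi)
              - probability(axis, theta, psi, phi))
        worst = max(worst, abs(dp) / abs(delta))
    return worst


print(max_ratio())
```

A sampled maximum of this kind cannot prove the inequality, but a value above 2 would contradict it.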

Thus, when the output state $U_a(\theta)|\psi\rangle$ is projected onto the
state $|\varphi\rangle$, the error caused by the approximate rotation is linear
in the accuracy of $\theta$. When there are $N$ such unitary transformations
$U = U(\theta_1)U(\theta_2)\cdots U(\theta_N)$ before the measurement, we have

$$|\delta P| \le 2(|\delta\theta_1| + |\delta\theta_2| + \ldots + |\delta\theta_N|). \qquad (9)$$

Therefore, in order to approximate a computation that is implemented by $N$
approximate operations to an overall accuracy $\varepsilon$, each operation
only needs to be accurate to $\varepsilon/N$ [11].

The number of qubits needed in the program register to approximate any
computation on an $n$-qubit data register to accuracy $\varepsilon$ is
calculated below. We know that any $2^n \times 2^n$ unitary operation can be
implemented by $O(n^3 4^n)$ CNOTs and single-qubit operations [3]. Each CNOT or
single-qubit operation can be implemented by at most $3n$ Instructions in the
QCPU. So the number of Instructions needed to implement an arbitrary operation
on $n$ qubits is $O(n^4 4^n)$. If the overall computation precision is
$\varepsilon$, then each Instruction needs to be accurate to
$O(\varepsilon/(n^4 4^n))$. A rotation implemented by a single Instruction in
the QCPU is accurate to about $2\pi/2^m$. So

$$m = O\!\left(\log_2 \frac{n^4 4^n}{\varepsilon}\right) = O(4\log_2 n + 2n - \ln\varepsilon). \qquad (10)$$

The number of qubits needed in the program register is
$1 + O(4\log_2 n + 2n - \ln\varepsilon) + 2(n - 1) = O(n) - \ln\varepsilon$. It
is linear in the number of data qubits and in $\ln\varepsilon$; therefore the
QCPU can be built efficiently to approximate universal quantum computation on
any number of qubits to any given computation accuracy $\varepsilon$.
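The scaling of the program register can be illustrated with a rough calculation. The sketch below makes two assumptions that are not part of the argument above: the constants hidden in the $O(\cdot)$ bounds are set to 1, so that exactly $n^4 4^n$ Instructions are counted, and the per-Instruction accuracy is taken to be the angular step $2\pi/2^m$. The function names are hypothetical, and the numbers it prints are indicative of the growth rate only.

```python
import math


def angle_bits_needed(n: int, eps: float) -> int:
    """Smallest m with 2*pi / 2**m <= eps / (n**4 * 4**n) (constants set to 1)."""
    per_instruction = eps / (n ** 4 * 4 ** n)
    return max(1, math.ceil(math.log2(2 * math.pi / per_instruction)))


def program_register_size(n: int, eps: float) -> int:
    """1 axis qubit + m angle qubits + 2(n - 1) CNOT-selection qubits."""
    return 1 + angle_bits_needed(n, eps) + 2 * (n - 1)


# Illustrative overall accuracy eps = 1e-3 for a few data-register sizes.
for n in (2, 4, 8, 16):
    print(n, angle_bits_needed(n, 1e-3), program_register_size(n, 1e-3))
```

Doubling $n$ roughly doubles the register size in this sketch, in line with the linear dependence on $n$ stated above.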

The QCPU enables one to realize quantum computations with different software
rather than different hardware (circuits of gates). A piece of quantum software
is a sequence of specifically programmed IS that makes a QCPU perform a
specific task. The notion of quantum software was first used by Preskill [12],
in a slightly different sense, but the virtue of quantum software itself is
similar. Realizing quantum computation with software rather than hardware is
preferable for many reasons. If the IS are in quantum states, the QCPU in
working mode two is capable of implementing the quantum remote control
introduced by Huelga et al. [13,14]. One important virtue of quantum software
will be to ensure that quantum computers function reliably. Because quantum
states are very fragile, quantum-computing hardware needs to meet very
demanding specifications. A quantum computer can achieve acceptable
reliability by applying the principles of quantum error correction [15,16].
Quantum error correction schemes would be most conveniently implemented with
quantum software. Some time in the future, a fault-tolerant quantum computer,
which could possibly be dominated by quantum software, may achieve processing
speeds far surpassing those of classical computers.



5 Conclusion 

To summarize, we have presented an architecture of QCPU which operates
on an $n$-qubit data register and is capable of completing any unitary
operation with accuracy $\varepsilon$ in a deterministic way. It contains
$O(n) - \ln\varepsilon$ three-qubit gates and $(n - 1)$ four-qubit gates and
needs $O(n) - \ln\varepsilon$ qubits as its input. Therefore it can be built
efficiently. Each gate involves at most four qubits, which may be an advantage
in real implementations of the quantum computer. We have described two working
modes of the QCPU. The QCPU has the ability to implement superposed and
entangled unitary operations on the data register, as shown by Eq. (8). This
ability could help us to implement more efficient algorithms: the number of
Instructions needed to implement an algorithm might be much smaller than the
upper bound $O(n^4 4^n)$. One noteworthy quality of our architecture is that it
can approximate $n$-qubit universal quantum computation with only $n$ qubits
plus $O(n) - \ln\varepsilon$ classical bits as its input. So with sufficiently
sophisticated hardware the QCPU could run on classical software but implement
quantum computation.

The QCPU makes it possible to put the solution of a problem into software
rather than hardware. This is an important step toward a general quantum
computer, because it is impracticable to build special hardware for each
problem. In the hardware of a quantum computer, qubits must preserve coherence
during operations. Thus the scale (the number of qubits in the quantum
hardware) and the operating time of a quantum computer are limited. When the
space complexity (scale) or time complexity (time) is beyond the capability of
the hardware, programming techniques in software may help us to balance space
complexity against time complexity. This possibly enlarges our computational
ability under a given hardware technology. Then, like the art of programming
for classical computers, the art of programming for quantum computers will
settle various kinds of problems with fixed, general-purpose hardware, and it
will also improve the efficiency of algorithms just by refining the program
(IS).

by the National Natural Science Foundation of China (Grants No. 10075041,
No. 10075044 and No. 10104014), and the National Fundamental Research